

Search for: All records

Creators/Authors contains: "Sim, Alex"


  1. This study focuses on improving the preparation of spectral data for machine learning. It does so through a case study that matches an airborne gamma-ray spectral survey of the San Francisco Bay area to geological classifications provided by the United States Geological Survey (Graymer et al., 2006). Our investigation revealed three key approaches for enhancing accuracy in this task: 1) eliminating extraneous data segments unrelated to the main task, 2) augmenting minority classes to improve class balance, and 3) merging inconsistent classes. By incorporating these methods, we increased classification accuracy from an initial 40.8% to approximately 72.7%. We plan to continue this work to further enhance performance, with the goal of extending the applicability of these methods to other data types and tasks. One potential future application is the detection of rare earth elements from aerial surveys.
  2. The volume of data moving through a network increases with new scientific experiments and simulations. Network bandwidth requirements also increase proportionally to deliver data within a certain time frame. We observe that a significant portion of popular datasets is transferred multiple times to different users, as well as to the same user, for various reasons. In-network caching of shared data has been shown to reduce redundant data transfers and consequently save network traffic volume. In addition, overall application performance is expected to improve with in-network caching because access to locally cached data results in lower latency. This paper shows how much data was shared over the study period, how much network traffic volume was consequently saved, and how much the temporary in-network caching increased scientific application performance. It also analyzes data access patterns in applications and the impact of caching nodes on the regional data repository. From the results, we observed that network bandwidth demand was reduced by nearly a factor of 3 over the study period.